This report presents an analysis of user behaviour for a food products company. The data was collected from the company's app and contains information on user actions over a span of two weeks.
One main goal of the analysis is to find out where there might be potential barriers to users making a purchase, by studying the company's sales funnel.
We will also examine the results of an A/A/B test, in which the two control groups see the old app font and the experiment group sees the new one.
In what follows, we will first import, check and prepare the data for analysis. Then we will look at the number of events and users in the logs, as well as the period the data covers. Following this, we will examine how frequently each event occurs, the proportion of users who performed each event, the sequence of user actions, and user behaviour in the sales funnel.
Finally, we will study the results of the A/A/B test to see whether changing the font of the entire app changes user behaviour. We will first look at the two control groups to check that the splitting of the groups was done properly. Then we will compare the results of the experiment group with those of the control groups.
pip install "notebook>=5.3" "ipywidgets>=7.5" | grep -v 'already satisfied'
Note: you may need to restart the kernel to use updated packages.
#import the packages needed
import pandas as pd
import numpy as np
import math as mth
from scipy import stats as st
import plotly.express as px
#import cufflinks as cf
from plotly import graph_objects as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot
init_notebook_mode(connected=True)
#cf.go_offline()
from matplotlib import pyplot as plt
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline
Let's get the data! To optimize the import, we will inspect the first 50 rows first.
df = pd.read_csv('https://code.s3.yandex.net/datasets/logs_exp_us.csv',
sep='\\t',
nrows=50)
df.info(memory_usage='deep')
<class 'pandas.core.frame.DataFrame'> RangeIndex: 50 entries, 0 to 49 Data columns (total 4 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 EventName 50 non-null object 1 DeviceIDHash 50 non-null int64 2 EventTimestamp 50 non-null int64 3 ExpId 50 non-null int64 dtypes: int64(3), object(1) memory usage: 4.9 KB
#checking if 'EventName' is a categorical column
df['EventName'].value_counts()
MainScreenAppear 27 CartScreenAppear 9 OffersScreenAppear 6 PaymentScreenSuccessful 6 Tutorial 2 Name: EventName, dtype: int64
#checking if 'ExpId' is a categorical column
df['ExpId'].value_counts()
246 21 248 15 247 14 Name: ExpId, dtype: int64
Indeed, these two are categorical columns. We will change the data type of 'EventName' and 'ExpId' to 'category' when importing the data.
#Checking the max value of the timestamp to see if 'int64' is excessive.
df['EventTimestamp'].max()
1564217125
This is only the max of the first 50 rows. However, even the Unix time for the year 2030 doesn't exceed the maximum int32 value (2147483647), so we will use 'int32' as this column's data type.
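As a quick sanity check of that claim, a standalone sketch (the year-2030 horizon is just an assumption about how long these logs might run):

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1  # 2147483647

# Unix timestamp for the start of 2030, well beyond this dataset's range
ts_2030 = int(datetime(2030, 1, 1, tzinfo=timezone.utc).timestamp())
print(ts_2030 < INT32_MAX)  # True -- int32 holds Unix times until January 2038
```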
Columns look mostly fine, except that two columns should use the 'category' data type and the timestamp column can use a smaller int type. The column names are also unconventional. Let's fix all of this when importing the whole dataset.
event_data = pd.read_csv('https://code.s3.yandex.net/datasets/logs_exp_us.csv',
sep='\\t',
names=['event_name', 'device_id', 'event_time',
'experiment_id'],
header=0,
dtype={'event_name': 'category',
'experiment_id': 'category',
'event_time': 'int32'})
event_data.head()
| event_name | device_id | event_time | experiment_id | |
|---|---|---|---|---|
| 0 | MainScreenAppear | 4575588528974610257 | 1564029816 | 246 |
| 1 | MainScreenAppear | 7416695313311560658 | 1564053102 | 246 |
| 2 | PaymentScreenSuccessful | 3518123091307005509 | 1564054127 | 248 |
| 3 | CartScreenAppear | 3518123091307005509 | 1564054127 | 248 |
| 4 | PaymentScreenSuccessful | 6217807653094995999 | 1564055322 | 248 |
Let's now check the general information of the data.
event_data.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 244126 entries, 0 to 244125 Data columns (total 4 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 event_name 244126 non-null category 1 device_id 244126 non-null int64 2 event_time 244126 non-null int32 3 experiment_id 244126 non-null category dtypes: category(2), int32(1), int64(1) memory usage: 3.3 MB
event_data.shape
(244126, 4)
Let's now check for null values and duplicates.
event_data.isna().sum()
event_name 0 device_id 0 event_time 0 experiment_id 0 dtype: int64
event_data.duplicated().sum()
413
There are no null values, but 413 duplicates. Let's take a look.
#getting the first 5 rows of the duplicated data.
event_data[event_data.duplicated()].head()
| event_name | device_id | event_time | experiment_id | |
|---|---|---|---|---|
| 453 | MainScreenAppear | 5613408041324010552 | 1564474784 | 248 |
| 2350 | CartScreenAppear | 1694940645335807244 | 1564609899 | 248 |
| 3573 | MainScreenAppear | 434103746454591587 | 1564628377 | 248 |
| 4076 | MainScreenAppear | 3761373764179762633 | 1564631266 | 247 |
| 4803 | MainScreenAppear | 2835328739789306622 | 1564634641 | 248 |
# Let's check the duplicate for the first row.
event_data[event_data['event_time'] == 1564474784]
| event_name | device_id | event_time | experiment_id | |
|---|---|---|---|---|
| 452 | MainScreenAppear | 5613408041324010552 | 1564474784 | 248 |
| 453 | MainScreenAppear | 5613408041324010552 | 1564474784 | 248 |
Yes indeed. These are identical. We will drop these in the next section.
#Let's also check for duplicates for the device_id and event_time subsets.
event_data[['device_id', 'event_time']].duplicated().sum()
23569
#Taking a look at the first five rows of such duplicates
event_data[event_data[['device_id', 'event_time']].duplicated()].head()
| event_name | device_id | event_time | experiment_id | |
|---|---|---|---|---|
| 3 | CartScreenAppear | 3518123091307005509 | 1564054127 | 248 |
| 28 | PaymentScreenSuccessful | 2029140728621466572 | 1564160801 | 246 |
| 36 | CartScreenAppear | 746235405686708560 | 1564170266 | 248 |
| 38 | PaymentScreenSuccessful | 6805714022805866600 | 1564170267 | 246 |
| 43 | CartScreenAppear | 6075091495307260046 | 1564209665 | 246 |
#Taking a look at the first combo
event_data[(event_data['device_id'] == 3518123091307005509)
& (event_data['event_time'] == 1564054127)]
| event_name | device_id | event_time | experiment_id | |
|---|---|---|---|---|
| 2 | PaymentScreenSuccessful | 3518123091307005509 | 1564054127 | 248 |
| 3 | CartScreenAppear | 3518123091307005509 | 1564054127 | 248 |
These duplicates are payment_successful and cart sharing a timestamp, which is actually possible. We will leave this type of duplicate untouched but keep it in mind when working with the sales funnels.
#Checking the categorical column 'event_name'
event_data['event_name'].value_counts()
MainScreenAppear 119205 OffersScreenAppear 46825 CartScreenAppear 42731 PaymentScreenSuccessful 34313 Tutorial 1052 Name: event_name, dtype: int64
#checking the 'experiment_id'
event_data['experiment_id'].value_counts()
248 85747 246 80304 247 78075 Name: experiment_id, dtype: int64
Data checking reveals a few issues, which we will address next in preprocessing.
In this section, we will address the data issues identified in the previous section.
Addressing duplicates
event_data.drop_duplicates(inplace = True)
event_data.duplicated().sum()
0
No duplicates now. Awesome!
Changing event_name values
Let's have a look at these values again.
event_data['event_name'].value_counts()
MainScreenAppear 119101 OffersScreenAppear 46808 CartScreenAppear 42668 PaymentScreenSuccessful 34118 Tutorial 1018 Name: event_name, dtype: int64
We will rename the values using their key words in lower case.
event_data['event_name'] = event_data['event_name'].replace({
'MainScreenAppear': 'main', 'OffersScreenAppear': 'offers',
'CartScreenAppear': 'cart',
'PaymentScreenSuccessful': 'payment_successful',
'Tutorial': 'tutorial'
})
Now let's check the names again.
event_data['event_name'].value_counts()
main 119101 offers 46808 cart 42668 payment_successful 34118 tutorial 1018 Name: event_name, dtype: int64
Changing the event_time
Let's change the event_time to the datetime data type.
event_data['event_time'] = pd.to_datetime(event_data['event_time'], unit='s')
event_data.head()
| event_name | device_id | event_time | experiment_id | |
|---|---|---|---|---|
| 0 | main | 4575588528974610257 | 2019-07-25 04:43:36 | 246 |
| 1 | main | 7416695313311560658 | 2019-07-25 11:11:42 | 246 |
| 2 | payment_successful | 3518123091307005509 | 2019-07-25 11:28:47 | 248 |
| 3 | cart | 3518123091307005509 | 2019-07-25 11:28:47 | 248 |
| 4 | payment_successful | 6217807653094995999 | 2019-07-25 11:48:42 | 248 |
Columns look good! While we are here, let's also add an event_date column.
event_data['event_date'] = pd.to_datetime(event_data['event_time'].dt.date)
#Checking the general information of the dataset one more time!
event_data.info()
<class 'pandas.core.frame.DataFrame'> Int64Index: 243713 entries, 0 to 244125 Data columns (total 5 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 event_name 243713 non-null object 1 device_id 243713 non-null int64 2 event_time 243713 non-null datetime64[ns] 3 experiment_id 243713 non-null category 4 event_date 243713 non-null datetime64[ns] dtypes: category(1), datetime64[ns](2), int64(1), object(1) memory usage: 9.5+ MB
Interesting: the event_name column has gone back to 'object'. This happened when we replaced the column's values, since replace() returns plain strings instead of preserving the categorical dtype. Let's change it back.
event_data['event_name'] = event_data['event_name'].astype('category')
event_data.info()
<class 'pandas.core.frame.DataFrame'> Int64Index: 243713 entries, 0 to 244125 Data columns (total 5 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 event_name 243713 non-null category 1 device_id 243713 non-null int64 2 event_time 243713 non-null datetime64[ns] 3 experiment_id 243713 non-null category 4 event_date 243713 non-null datetime64[ns] dtypes: category(2), datetime64[ns](2), int64(1) memory usage: 7.9 MB
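As an aside, the detour through 'object' can be avoided altogether: renaming the categories keeps the dtype intact. A minimal sketch on toy data (not the project dataset):

```python
import pandas as pd

s = pd.Series(['MainScreenAppear', 'Tutorial', 'MainScreenAppear'], dtype='category')

# replace() materializes new string values and drops the categorical dtype;
# cat.rename_categories() relabels the categories and keeps it
s = s.cat.rename_categories({'MainScreenAppear': 'main', 'Tutorial': 'tutorial'})
print(s.dtype)  # category
```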
In this section, we addressed the data issues identified earlier. Now the dataset is ready to be further examined.
In this section, we will analyze the events data to address the following key questions:
How many events and users are in the logs?
What's the average number of events per user?
What period of time does the data cover, and do we need to slice the data before the analysis?
First let's take another look at what our data looks like now.
event_data.head()
| event_name | device_id | event_time | experiment_id | event_date | |
|---|---|---|---|---|---|
| 0 | main | 4575588528974610257 | 2019-07-25 04:43:36 | 246 | 2019-07-25 |
| 1 | main | 7416695313311560658 | 2019-07-25 11:11:42 | 246 | 2019-07-25 |
| 2 | payment_successful | 3518123091307005509 | 2019-07-25 11:28:47 | 248 | 2019-07-25 |
| 3 | cart | 3518123091307005509 | 2019-07-25 11:28:47 | 248 | 2019-07-25 |
| 4 | payment_successful | 6217807653094995999 | 2019-07-25 11:48:42 | 248 | 2019-07-25 |
First, let's take a look at how many events and users are in the logs.
event_data['event_name'].value_counts()
main 119101 offers 46808 cart 42668 payment_successful 34118 tutorial 1018 Name: event_name, dtype: int64
event_data['event_name'].value_counts().sum()
243713
There are 5 types of events, totalling 243,713 occurrences. Now let's look at how many users there are in total, and how many in each of the 3 groups. We will count unique users.
total_users = event_data['device_id'].nunique()
control_group_1 = event_data[event_data['experiment_id']
== '246']['device_id'].nunique()
control_group_2 = event_data[event_data['experiment_id']
== '247']['device_id'].nunique()
experiment_group = event_data[event_data['experiment_id']
== '248']['device_id'].nunique()
print('There are a total of {0} users, {1} and {2} for the two control groups, and {3} for the experiment group.'.
format(total_users, control_group_1, control_group_2, experiment_group))
There are a total of 7551 users, 2489 and 2520 for the two control groups, and 2542 for the experiment group.
#Calculating the average event per user.
print('Average number of events per user: ',
(event_data['event_name'].count()/event_data['device_id'].nunique()).round())
Average number of events per user: 32.0
On average, each user performed 32 events. Although this gives only a very general picture, considering there are only 5 event types and the data covers a period of two weeks, the app is doing quite well!
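Since a few heavy users can inflate the mean, the median events per user is a useful companion figure. A sketch on toy counts (hypothetical device IDs, not the real logs):

```python
import pandas as pd

# Toy log: user 1 is a heavy user, users 2 and 3 are light
toy_log = pd.DataFrame({'device_id': [1, 1, 1, 1, 1, 1, 2, 2, 3]})

per_user = toy_log.groupby('device_id').size()
print(per_user.mean())    # 3.0 -- pulled up by the heavy user
print(per_user.median())  # 2.0
```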
Next, let's also take a quick look at the number of users from each of the three groups performing the 5 event actions.
event_data.pivot_table(index='event_name',
columns='experiment_id',
values='device_id',
aggfunc='nunique')
| experiment_id | 246 | 247 | 248 |
|---|---|---|---|
| event_name | |||
| cart | 1270 | 1240 | 1239 |
| main | 2456 | 2482 | 2501 |
| offers | 1545 | 1530 | 1538 |
| payment_successful | 1202 | 1160 | 1185 |
| tutorial | 279 | 286 | 282 |
From the above, we can see that, overall, the main page has the biggest share of users, followed by offers, cart, payment, and lastly tutorial.
This is interesting. While main, then offers, then cart, then payment is likely the sequence of actions many users follow, tutorial is unlikely to be the last user action. The extremely low user numbers for tutorial across the three groups simply indicate the page's lack of popularity.
We will further examine the sequence of events in the funnel section later in the report.
Let's find out the earliest and latest time of the records in the logs.
print("Earliest record:", event_data['event_time'].min(
), "\nLatest record: ", event_data['event_time'].max())
Earliest record: 2019-07-25 04:43:36 Latest record: 2019-08-07 21:15:17
The logs cover a time period of approximately two weeks. Now let's look at the distribution of datetimes in the logs.
fig = px.histogram(event_data, x="event_time",
title = 'Distribution of datetime',
labels = {'event_time': 'Datetime'},
nbins=30)
fig.update_xaxes(tickangle=45)
fig.update_yaxes(title_text = "Count of datetime values")
fig.update_layout(title_x=0.5)
fig.show()
The distribution of datetimes is not even. If we hover over the chart to check the values, we can see that the distribution changes from very sparse to stable at around 15k somewhere around July 31 and August 1. It is possible that older events ended up in some users' logs for technical reasons, and this could skew the overall picture. We need to find the moment at which the data starts to be complete and ignore the earlier section.
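One way to pick that moment programmatically rather than by eye (a sketch on synthetic daily counts; the half-the-median threshold is an assumption of ours, not an established rule):

```python
import pandas as pd

# Synthetic daily event counts: a sparse ramp-up followed by a stable period
daily = pd.Series(
    [120, 340, 900, 2400, 14800, 15200, 15100, 14950],
    index=pd.date_range('2019-07-29', periods=8),
)

# First date whose count reaches at least half of the overall median
cutoff = daily[daily >= daily.median() / 2].index.min()
print(cutoff.date())  # 2019-08-02
```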
Just to be sure where this should be, let's also plot the distribution of the date.
fig = px.histogram(event_data, x="event_date",
title = 'Distribution of date',
labels = {'event_date': 'Date'},
nbins=15)
fig.update_xaxes(tickangle=45)
fig.update_yaxes(title_text = "Count of date values")
fig.update_layout(title_x=0.5)
fig.show()
Okay. Now we can be fairly certain that the data becomes complete from August 1st. Let's disregard the data earlier than this.
event_data_new = event_data[event_data['event_date'] >= '2019-08-01']
# Checking the new earliest and latest time:
print("Earliest record:", event_data_new['event_time'].min(
), "\nLatest record: ", event_data_new['event_time'].max())
Earliest record: 2019-08-01 00:07:28 Latest record: 2019-08-07 21:15:17
Okay. Now the logs span approximately a week. Let's see how much data we have lost.
event_data_new.shape[0]/event_data.shape[0] -1
-0.01159560630741896
We lost slightly over 1% of the data, which is acceptable. Let's also check, event-wise, how much data we have lost.
(event_data_new['event_name'].value_counts(
)/event_data['event_name'].value_counts())-1
main -0.014887 offers -0.010148 cart -0.008554 payment_successful -0.005862 tutorial -0.012770 Name: event_name, dtype: float64
For each of the 5 events, we lost approximately 0.5% to 1.5% of the data. This is relatively even and acceptable. Finally, let's have a look at how many users and events we have lost for each experiment group.
#Checking loss of occurrences by group
(event_data_new['experiment_id'].value_counts(
)/event_data['experiment_id'].value_counts())-1
248 -0.011907 246 -0.010963 247 -0.011905 Name: experiment_id, dtype: float64
#checking loss of unique users by group
(event_data_new.groupby('experiment_id').agg(
{'device_id': 'nunique'}) - event_data.groupby('experiment_id').agg({
'device_id': 'nunique'}))/event_data_new.groupby('experiment_id').agg(
{'device_id': 'nunique'})
| device_id | |
|---|---|
| experiment_id | |
| 246 | -0.002013 |
| 247 | -0.002786 |
| 248 | -0.001971 |
The loss in the number of events is very similar and small across the two control groups and the experiment group, as is the loss in the number of unique users for each group. The new dataset is in just as good shape!
In this section, we studied the events and users. One thing we noticed is that older data had sneaked into the dataset, which we then removed. In the next section, we will examine the event funnel!
In this section, we will address the following key questions:
the frequency of occurrence for events in the logs
the number of users who performed each of these actions, as well as the proportion of users who performed the action at least once
the sequence of user actions
the share of users that proceed from each stage to the next in the sales funnel
the stage where we lose the most users
the share of users who make the entire journey from their first event to payment
#Calculating the number of occurrences of each event in the logs.
event_occurrence = event_data_new.groupby('event_name', as_index=False).agg(
{'device_id': 'count'}).sort_values(by='device_id', ascending=False)
event_occurrence.columns = ['event', 'Number of occurrences']
#plot the bar chart
fig = px.bar(event_occurrence,
x='event',
y='Number of occurrences',
title='Number of occurrences for each event')
fig.update_xaxes(title_text='Events', tickangle=45)
fig.update_yaxes(title_text="Number of event occurrences")
fig.update_layout(title_x=0.5)
fig.show()
We can see from the above that accessing the main page is the most popular event, occurring close to 120k times. Next is offers, dropping to 46.3k occurrences, then cart (42.3k) and payment (33.9k). The least popular event is the tutorial, occurring only 1005 times.
#calculating the share of unique users of each event of the total users.
event_users = event_data_new.groupby('event_name').agg(
{'device_id': 'nunique'})/event_data_new['device_id'].nunique()
event_users.columns = ['Share of unique users']
event_users = event_users.sort_values(by='Share of unique users', ascending=False)
#plot the bar chart
fig = px.bar(event_users,
y='Share of unique users',
title='Share of unique users for each event')
fig.update_xaxes(title_text='Events', tickangle=45)
fig.update_yaxes(title_text="Proportion of unique users")
fig.update_layout(title_x=0.5)
fig.show()
From the bar chart above, we get the proportion of unique users for each event, i.e. the users who performed the event at least once. Note that the proportions do not add up to one, because users do not necessarily perform just one event.
Similar to the event occurrences we calculated earlier, the main page is the most popular among users, reaching over 98% of them. It is followed by offers, accessed by over 60% of users. Almost 50% of users put something in the cart, and almost 47% completed the payment process.
The above gives us a general picture of each event's popularity among users. The biggest drop in user popularity happens from main to offers. However, the order manifested in popularity is not necessarily the sequence of events; not all users follow the same sequence, after all. To find out where the most users were lost, we will need to look at the sales funnels later in this report.
First, let's take another look at the dataset.
event_data_new.head()
| event_name | device_id | event_time | experiment_id | event_date | |
|---|---|---|---|---|---|
| 2828 | tutorial | 3737462046622621720 | 2019-08-01 00:07:28 | 246 | 2019-08-01 |
| 2829 | main | 3737462046622621720 | 2019-08-01 00:08:00 | 246 | 2019-08-01 |
| 2830 | main | 3737462046622621720 | 2019-08-01 00:08:55 | 246 | 2019-08-01 |
| 2831 | offers | 3737462046622621720 | 2019-08-01 00:08:58 | 246 | 2019-08-01 |
| 2832 | main | 1433840883824088890 | 2019-08-01 00:08:59 | 247 | 2019-08-01 |
To build the sales funnel, we will first find the time each user first performed each action.
users = event_data_new.pivot_table(index='device_id',
columns='event_name',
values='event_time',
aggfunc='min')
users.head(10)
| event_name | cart | main | offers | payment_successful | tutorial |
|---|---|---|---|---|---|
| device_id | |||||
| 6888746892508752 | NaT | 2019-08-06 14:06:34 | NaT | NaT | NaT |
| 6909561520679493 | 2019-08-06 18:52:58 | 2019-08-06 18:52:54 | 2019-08-06 18:53:04 | 2019-08-06 18:52:58 | NaT |
| 6922444491712477 | 2019-08-04 14:19:40 | 2019-08-04 14:19:33 | 2019-08-04 14:19:46 | 2019-08-04 14:19:40 | NaT |
| 7435777799948366 | NaT | 2019-08-05 08:06:34 | NaT | NaT | NaT |
| 7702139951469979 | 2019-08-02 14:28:45 | 2019-08-01 04:29:54 | 2019-08-01 04:29:56 | 2019-08-02 14:28:45 | NaT |
| 8486814028069281 | 2019-08-05 04:49:18 | 2019-08-05 04:52:40 | 2019-08-05 04:49:13 | NaT | NaT |
| 8740973466195562 | NaT | 2019-08-02 09:16:48 | 2019-08-02 09:43:59 | NaT | NaT |
| 9841258664663090 | 2019-08-03 10:52:15 | 2019-08-03 10:47:59 | 2019-08-03 10:49:42 | 2019-08-03 17:57:27 | 2019-08-03 10:47:28 |
| 12692216027168046 | NaT | 2019-08-02 16:28:49 | 2019-08-05 04:06:02 | NaT | NaT |
| 15708180189885246 | 2019-08-01 11:06:19 | 2019-08-01 16:08:23 | 2019-08-01 05:38:55 | 2019-08-01 11:06:19 | NaT |
Even a quick look at the first 10 rows shows a big variety in user behaviour. Most users start with the main page, except one who starts with the tutorial. Some access the offers before the cart, whereas others access the cart first and then offers, or even the payment before offers, which is rather late to check the offers!
Out of the five who made payments, 4 users have the same cart and payment_successful time. Without further information about the app, we don't know why this happens. Perhaps the cart and payment completion pages are actually one page? In real life, this would be worth confirming with the app team.
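To quantify how common this is, one could count paying users whose first cart and first payment timestamps coincide. A sketch on toy data shaped like the `users` pivot above (the values are made up):

```python
import pandas as pd

# Toy stand-in for the `users` pivot: first-event timestamps per device
users_toy = pd.DataFrame({
    'cart': pd.to_datetime(['2019-08-06 18:52:58', '2019-08-02 14:28:45', None]),
    'payment_successful': pd.to_datetime(['2019-08-06 18:52:58', '2019-08-02 14:30:00', None]),
})

paid = users_toy['payment_successful'].notna()
# NaT == NaT evaluates to False, so non-payers never match
same_ts = users_toy['cart'] == users_toy['payment_successful']
print(int((same_ts & paid).sum()), 'of', int(paid.sum()), 'payers')  # 1 of 2 payers
```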
Out of curiosity, before we decide on the sales funnel, let's find the percentage of users who start with each of the five events. For this, we are only interested in where the users started, regardless of the sequence after that.
#First we need to fillna in order to be able to compare
users_ = users.fillna(pd.to_datetime('2000-01-01 00:00:00'))
users_.head()
| event_name | cart | main | offers | payment_successful | tutorial |
|---|---|---|---|---|---|
| device_id | |||||
| 6888746892508752 | 2000-01-01 00:00:00 | 2019-08-06 14:06:34 | 2000-01-01 00:00:00 | 2000-01-01 00:00:00 | 2000-01-01 |
| 6909561520679493 | 2019-08-06 18:52:58 | 2019-08-06 18:52:54 | 2019-08-06 18:53:04 | 2019-08-06 18:52:58 | 2000-01-01 |
| 6922444491712477 | 2019-08-04 14:19:40 | 2019-08-04 14:19:33 | 2019-08-04 14:19:46 | 2019-08-04 14:19:40 | 2000-01-01 |
| 7435777799948366 | 2000-01-01 00:00:00 | 2019-08-05 08:06:34 | 2000-01-01 00:00:00 | 2000-01-01 00:00:00 | 2000-01-01 |
| 7702139951469979 | 2019-08-02 14:28:45 | 2019-08-01 04:29:54 | 2019-08-01 04:29:56 | 2019-08-02 14:28:45 | 2000-01-01 |
#Percentage of users who started with the main page
len(users_[(users_['main'] > users_['offers']) & (users_['main'] > users_['cart']) & (
users_['main'] > users_['payment_successful']) & (users_['main'] > users_['tutorial'])])/len(users_)
0.400849482346695
Approximately 40% of the users start their use of the app on the main page.
#Percentage of users who started with the offers page
len(users_[(users_['offers'] > users_['main']) & (users_['offers'] > users_['cart']) & (
users_['offers'] > users_['payment_successful']) & (users_['offers'] > users_['tutorial'])])/len(users_)
0.29041677727634724
29% of the users landed on the app first on the offers page.
#Percentage of users who started with the cart page
#We use >= between cart and payment_successful given our previous observation
len(users_[(users_['cart'] > users_['offers']) & (users_['cart'] > users_['main']) & (
users_['cart'] >= users_['payment_successful']) & (users_['cart'] > users_['tutorial'])])/len(users_)
0.19790284045659676
Close to 20% of the users start with cart in the app, possibly at the same time as the payment (given the '>=' sign).
#Percentage of users who started with the payment_successful page
len(users_[(users_['payment_successful'] > users_['offers']) & (users_['payment_successful'] >= users_['cart']) & (
users_['payment_successful'] > users_['main']) & (users_['payment_successful'] > users_['tutorial'])])/len(users_)
0.2255110167241837
Interesting: slightly over 22% of the users start with the payment in the app, possibly at the same time as the cart (given the '>=' sign).
#Percentage of users who started with the tutorial page
len(users_[(users_['tutorial'] > users_['offers']) & (users_['tutorial'] > users_['cart']) & (
users_['tutorial'] > users_['payment_successful']) & (users_['tutorial'] > users_['main'])])/len(users_)
0.005309264666843642
About 0.5% of the total users started at the tutorial of the app.
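As a cross-check on the shares above, `DataFrame.idxmin(axis=1)` returns the column holding each row's earliest timestamp directly, with no fill value or pairwise comparisons needed (it skips NaT, and ties go to the leftmost column, so a cart/payment tie would need the same special-casing as above). A sketch on toy data:

```python
import pandas as pd

users_toy = pd.DataFrame({
    'main': pd.to_datetime(['2019-08-01 10:00', '2019-08-01 12:00']),
    'offers': pd.to_datetime(['2019-08-01 09:00', None]),
})

# Column name of each row's minimum timestamp, i.e. each user's first event
first_event = users_toy.idxmin(axis=1)
print(first_event.tolist())  # ['offers', 'main']
```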
The information above gives us some idea of where the users first landed in the app. Potentially we could track each possible sequence to see which one is the most profitable. However, given the space of this report, we will select three funnels to study. As the main page is where the highest percentage of users first landed, we will first study two funnels that start from the main page.
Out of curiosity, we will also add a third funnel to examine what happens when users start with the tutorial.
One more thing to note: the time for payment_successful can be the same as for cart. We will take that into account in the calculations below.
Funnel one: main -> offers -> cart -> payment_successful
# Creating conditions to slice the data
step_1 = users['main'].notna()
step_2 = step_1 & (users['offers'].notna()) & (users['offers'] > users['main'])
step_3 = step_2 & (users['cart'].notna()) & (users['cart'] > users['offers'])
step_4 = step_3 & (users['payment_successful'].notna()) & (
users['payment_successful'] >= users['cart'])
# Calculating the users who accessed each page in order
n_main = users[step_1].shape[0]
n_offers = users[step_2].shape[0]
n_cart = users[step_3].shape[0]
n_payment = users[step_4].shape[0]
print('Users who accessed the main page:', n_main)
print('Users who accessed the offers page:', n_offers)
print('Users who added item to the cart:', n_cart)
print('Users who completed the payment:', n_payment)
Users who accessed the main page: 7419 Users who accessed the offers page: 4201 Users who added item to the cart: 1767 Users who completed the payment: 1338
fig = go.Figure(go.Funnel(
y = ["Users who accessed the main page",
"Users who accessed the offers page",
"Users who added product to the cart",
"Users who successfully paid"],
x = [7419, 4201, 1767, 1338],
textposition = "inside",
textinfo = "value+percent initial"
))
fig.update_layout(title_text= 'Funnel one',
title_x=0.5)
fig.show()
As shown above, for users in this sales funnel, the step from offers to cart is where we lost the highest percentage of users: approximately 57.9% were lost at this stage.
A total of 18% of the users made it from the beginning to the payment completion.
Funnel two: main -> cart -> offers -> payment_successful
step_1 = users['main'].notna()
step_2 = step_1 & (users['cart'].notna()) & (users['cart'] > users['main'])
step_3 = step_2 & (users['offers'].notna()) & (users['offers'] > users['cart'])
step_4 = step_3 & (users['payment_successful'].notna()) & (
users['payment_successful'] >= users['offers'])
n_main = users[step_1].shape[0]
n_cart = users[step_2].shape[0]
n_offers = users[step_3].shape[0]
n_payment = users[step_4].shape[0]
print('Users who accessed the main page:', n_main)
print('Users who accessed the cart page:', n_cart)
print('Users who accessed the offers page:', n_offers)
print('Users who completed the payment:', n_payment)
Users who accessed the main page: 7419 Users who accessed the cart page: 3424 Users who accessed the offers page: 1476 Users who completed the payment: 254
fig = go.Figure(go.Funnel(
y = ["Users who accessed the main page",
"Users who added item to the cart",
"Users who accessed the offers page",
"Users who successfully paid"],
x = [7419, 3424, 1476, 254],
textposition = "inside",
textinfo = "value+percent initial"
))
fig.update_layout(title_text= 'Funnel two',
title_x=0.5)
fig.show()
For users in this sales funnel, the step from offers to payment is where we lost the highest percentage of users (82.8%). Maybe the offers are not attractive enough! This is followed by the stage from cart to offers (56.9%).
Only a total of 3.4% of the users from the main page made it to the payment.
Funnel three: tutorial -> main -> offers -> cart -> payment_successful
step_0 = users['tutorial'].notna()
step_1 = step_0 & (users['main'].notna()) & (users['main'] > users['tutorial'])
step_2 = step_1 & (users['offers'].notna()) & (users['offers'] > users['main'])
step_3 = step_2 & (users['cart'].notna()) & (users['cart'] > users['offers'])
step_4 = step_3 & (users['payment_successful'].notna()) & (
users['payment_successful'] >= users['cart'])
n_tutorial = users[step_0].shape[0]
n_main = users[step_1].shape[0]
n_offers = users[step_2].shape[0]
n_cart = users[step_3].shape[0]
n_payment= users[step_4].shape[0]
print('Users who accessed the tutorial:', n_tutorial)
print('Users who accessed the main page:', n_main)
print('Users who accessed the offers page:', n_offers)
print('Users who accessed the cart page:', n_cart)
print('Users who completed the payment:', n_payment)
Users who accessed the tutorial: 840 Users who accessed the main page: 787 Users who accessed the offers page: 622 Users who accessed the cart page: 349 Users who completed the payment: 271
fig = go.Figure(go.Funnel(
y=['Users who accessed the tutorial',
"Users who accessed the main page",
"Users who accessed the offers page",
"Users who added product to the cart",
"Users who successfully paid"],
x=[840, 787, 622, 349, 271],
textposition="inside",
textinfo="value+percent initial"
))
fig.update_layout(title_text='Funnel three',
title_x=0.5)
fig.show()
For users who watched the tutorial and then accessed main, offers, cart, and payment, the biggest loss happens from offers to cart (43.9%). As many as 32% of users who watched the tutorial ended up paying successfully!
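The three funnels above repeat the same cumulative-mask pattern, so the logic could be factored into a helper (a sketch; `funnel_counts` and its `allow_ties` parameter are names of our own invention, and the toy data is not the project dataset):

```python
import pandas as pd

def funnel_counts(users, steps, allow_ties=()):
    """Count users whose first occurrences of `steps` happen in order.

    Steps listed in `allow_ties` may share a timestamp with the previous step.
    """
    mask = users[steps[0]].notna()
    counts = [int(mask.sum())]
    for prev, step in zip(steps, steps[1:]):
        in_order = users[step] >= users[prev] if step in allow_ties else users[step] > users[prev]
        mask = mask & users[step].notna() & in_order
        counts.append(int(mask.sum()))
    return counts

# Toy data: user 0 completes the funnel (cart and payment tie), user 1 stops at main
toy = pd.DataFrame({
    'main': pd.to_datetime(['2019-08-01 10:00', '2019-08-01 11:00']),
    'offers': pd.to_datetime(['2019-08-01 10:05', None]),
    'cart': pd.to_datetime(['2019-08-01 10:10', None]),
    'payment_successful': pd.to_datetime(['2019-08-01 10:10', None]),
})
print(funnel_counts(toy, ['main', 'offers', 'cart', 'payment_successful'],
                    allow_ties={'payment_successful'}))  # [2, 1, 1, 1]
```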
In this section, we studied the events and users. In terms of both event occurrence and proportion of users, the main page has the highest popularity, followed by offers, cart, payment, and tutorials.
We found that users vary a lot in where they start their journey in the app, and selected 3 sales funnels to calculate the proportion of users who progress from one stage to the next. For all three, we lost the highest percentage of users after the offers page. It's hard to say why, but it would help to check the offers page to make sure the offers are attractive and that the page links smoothly to the cart or payment page.
Having said that, we only looked at users who started with main or the tutorial. There are users who started elsewhere, or who started with main, took a shortcut, and nevertheless paid successfully. It's important to keep this in mind.
Among the 3 funnels, the one where users started with the tutorial (funnel 3) has the highest percentage of users who eventually made the payment (32%). Those who followed the sequence of main, offers, cart, and payment came second in the proportion of initial users who made the payment (18%).
This could indicate that those who would start with tutorials are more keen to purchase in the first place, or the watching tutorials in a way might have facilitated the eventual payment. It's hard to tell. We also need to be aware that those who start with tutorials only represent a very small population among the total users (0.05%).
In this section, we will address the following key questions:
whether there is a statistically significant difference between the two control groups in the proportion of users who performed each of the five actions;
whether there is a statistically significant difference in the proportion of users who performed each of the five actions between control group 1 and the experiment group, between control group 2 and the experiment group, and between the control groups combined and the experiment group.
# Checking how many users in each group
event_users = event_data_new.groupby(
'experiment_id').agg({'device_id': 'nunique'}).reset_index()
event_users
|   | experiment_id | device_id |
|---|---|---|
| 0 | 246 | 2484 |
| 1 | 247 | 2513 |
| 2 | 248 | 2537 |
#checking the size difference between the two control groups
(2484-2513)/2484
-0.011674718196457327
Ideally, the number of users in the two A/A groups doesn't vary by more than 1%. In this case, the difference is just slightly over 1% (about 1.2%), which can be considered acceptable.
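The size check above can be wrapped in a small helper; a sketch using the user counts reported earlier (2484 and 2513):

```python
# Absolute difference between the two control-group sizes,
# relative to the first group.
def relative_diff(a, b):
    return abs(a - b) / a

diff = relative_diff(2484, 2513)
print(f'{diff:.2%}')  # about 1.17%
```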
Next, let's create subsets of the data containing each of the control groups, the experiment group, and the control groups combined.
group_a1 = event_data_new[event_data_new['experiment_id'] == '246']
group_a2 = event_data_new[event_data_new['experiment_id'] == '247']
group_b = event_data_new[event_data_new['experiment_id'] == '248']
group_a = event_data_new[(event_data_new['experiment_id'] == '246') | (
event_data_new['experiment_id'] == '247')]
Let's also check if the users accidentally show up in more than one group.
users_aa = np.intersect1d(group_a1['device_id'].unique(),
group_a2['device_id'].unique())
users_a1b = np.intersect1d(group_a1['device_id'].unique(),
group_b['device_id'].unique())
users_a2b = np.intersect1d(group_a2['device_id'].unique(),
group_b['device_id'].unique())
print(len(users_aa), len(users_a1b), len(users_a2b))
0 0 0
No overlap between any of the groups, great!
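The same check can be done in pandas without pairwise intersections; a sketch on a small made-up frame that mirrors event_data_new's columns: count the number of distinct experiment groups each user appears in, which should be 1 for everyone.

```python
import pandas as pd

# Hypothetical mini-frame with the same columns as event_data_new.
toy = pd.DataFrame({
    'device_id':     [1, 1, 2, 3, 3, 4],
    'experiment_id': ['246', '246', '247', '248', '248', '246'],
})

# Number of distinct groups each user belongs to.
groups_per_user = toy.groupby('device_id')['experiment_id'].nunique()
overlapping = int((groups_per_user > 1).sum())
print(overlapping)  # 0 -> no user sits in more than one group
```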
For the hypothesis test that the proportions of users who performed a given event among the total users are the same between two groups, we will define a function that takes the event name and the two groups to compare as parameters.
For the test performed by this function, the significance level is set as 0.05.
The null hypothesis is that the two proportions of users performing the actions among the total users are the same; the alternative hypothesis is they are different.
We will first run the test without correcting the significance level.
def test(event, group_1, group_2):
alpha = .05
#creating the array containing the number of unique users who performed the event action
users_event = np.array([group_1[group_1['event_name'] == event]['device_id'].nunique(),
group_2[group_2['event_name'] == event]['device_id'].nunique()])
#creating the array containing the number of total unique users
users_total = np.array([group_1['device_id'].nunique(),
group_2['device_id'].nunique()])
# share of users who performed the event action in group_1:
p1 = users_event[0]/users_total[0]
# share of users who performed the event action in group_2:
p2 = users_event[1]/users_total[1]
# proportion in the combined dataset:
p_combined = (users_event[0] + users_event[1]) / \
(users_total[0] + users_total[1])
# the difference between the datasets' proportions
difference = p1 - p2
# calculating the statistic in standard deviations of the standard normal distribution
z_value = difference / \
mth.sqrt(p_combined * (1 - p_combined) *
(1/users_total[0] + 1/users_total[1]))
# setting up the standard normal distribution (mean 0, standard deviation 1)
distr = st.norm(0, 1)
p_value = (1 - distr.cdf(abs(z_value))) * 2
print('\np-value: ', p_value)
if (p_value < alpha):
print('Rejecting the null hypothesis: there is a significant'
'difference between the proportions of users.\n')
else:
print('Failed to reject the null hypothesis: there is'
' no reason to consider the proportions are different.\n')
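The function above implements a standard pooled two-proportion z-test. A minimal, self-contained version of the same statistic that returns the p-value instead of printing it, handy for sanity checks on made-up counts:

```python
import math as mth
import numpy as np
from scipy import stats as st

# Standalone pooled two-proportion z-test; users_event and users_total
# are length-2 arrays (event users and total users per group).
def z_test_proportions(users_event, users_total):
    p1 = users_event[0] / users_total[0]
    p2 = users_event[1] / users_total[1]
    p_combined = ((users_event[0] + users_event[1])
                  / (users_total[0] + users_total[1]))
    z_value = (p1 - p2) / mth.sqrt(
        p_combined * (1 - p_combined)
        * (1 / users_total[0] + 1 / users_total[1]))
    return (1 - st.norm(0, 1).cdf(abs(z_value))) * 2

# Identical shares give a p-value of 1; a large gap gives a tiny one.
print(z_test_proportions(np.array([100, 100]), np.array([1000, 1000])))
print(z_test_proportions(np.array([200, 100]), np.array([1000, 1000])))
```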
#creating a list containing all the event names
event_names = ['main', 'offers', 'cart', 'payment_successful', 'tutorial']
Now we will run the function in a loop over the list of events for the two control groups.
#Creating a class which we can use to print bold
class color:
BOLD = '\033[1m'
END = '\033[0m'
for event in event_names:
print(color.BOLD + 'Result of significance test for event - {}'.format(event) + color.END)
test(event, group_a1, group_a2)
Result of significance test for event - main p-value: 0.7570597232046099 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - offers p-value: 0.2480954578522181 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - cart p-value: 0.22883372237997213 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - payment_successful p-value: 0.11456679313141849 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - tutorial p-value: 0.9376996189257114 Failed to reject the null hypothesis: there is no reason to consider the proportions are different.
As shown above, we fail to reject any of the null hypotheses. Given that we are performing multiple tests on partially overlapping data, in theory we should adjust the significance level.
However, since none of the tests rejected the null hypothesis, adjusting the significance level would not change the results.
The results above confirm that the splitting of the groups was done properly. We can move on to the A/B test.
For the A/B test, we will use the same function defined earlier. The significance level is still 0.05, and we will run the test between group B and each of the control groups, and then between group B and the control groups combined.
Let's first perform the test for events between the first control group and B group.
for event in event_names:
print(color.BOLD + 'Result of significance test for event - {}'.format(event) + color.END)
test(event, group_a1, group_b)
Result of significance test for event - main p-value: 0.2949721933554552 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - offers p-value: 0.20836205402738917 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - cart p-value: 0.07842923237520116 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - payment_successful p-value: 0.2122553275697796 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - tutorial p-value: 0.8264294010087645 Failed to reject the null hypothesis: there is no reason to consider the proportions are different.
We fail to reject any of the null hypotheses above. Now let's run the tests between the second control group and group B.
for event in event_names:
print(color.BOLD + 'Result of significance test for event - {}'.format(event)+ color.END)
test(event, group_a2, group_b)
Result of significance test for event - main p-value: 0.4587053616621515 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - offers p-value: 0.9197817830592261 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - cart p-value: 0.5786197879539783 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - payment_successful p-value: 0.7373415053803964 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - tutorial p-value: 0.765323922474501 Failed to reject the null hypothesis: there is no reason to consider the proportions are different.
The same happened: we failed to reject any of the null hypotheses.
Finally, let's run the tests between the control groups combined and group B.
for event in event_names:
print(color.BOLD + 'Result of significance test for event - {}'.format(event)+ color.END)
test(event, group_a, group_b)
Result of significance test for event - main p-value: 0.29424526837179577 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - offers p-value: 0.43425549655188256 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - cart p-value: 0.18175875284404386 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - payment_successful p-value: 0.6004294282308704 Failed to reject the null hypothesis: there is no reason to consider the proportions are different. Result of significance test for event - tutorial p-value: 0.764862472531507 Failed to reject the null hypothesis: there is no reason to consider the proportions are different.
Comparing group B against the control groups combined didn't make any difference: we failed to reject all the null hypotheses. Changing the font of the app didn't make any statistically significant difference to the share of users who perform any of the event actions.
Technically, we performed multiple tests above: A1 vs B, A2 vs B, and A vs B, with five tests per pairing on partially overlapping data. With multiple testing, the probability of making a type I error, that is, rejecting the null hypothesis when it is true, increases with each new hypothesis test.
In our case, none of the tests resulted in rejecting the null hypothesis, so we don't need to apply any of the methods for correcting the significance level.
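Had any comparison rejected the null hypothesis, we would have re-checked it under a corrected threshold. A minimal sketch of the Bonferroni adjustment for the 15 A/B comparisons above (3 group pairings x 5 events):

```python
# Bonferroni correction: divide the significance level by the
# number of tests performed.
alpha = 0.05
n_tests = 15  # 3 group pairings x 5 events
alpha_bonferroni = alpha / n_tests
print(f'Bonferroni-corrected significance level: {alpha_bonferroni:.4f}')
```

Any p-value would then be compared against roughly 0.0033 instead of 0.05; more powerful alternatives such as the Holm or Šidák corrections exist, but Bonferroni is the simplest to apply.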
The significance tests in this section show that we cannot prove that changing the font of the app changes user behaviour. We don't have enough evidence to say that users are intimidated by the new fonts, which was the managers' concern.
In this report, we studied the user log data of a food products company's app over the span of one week.
The investigation of event occurrence shows that the main page is the most popular among users, whereas the tutorial is the least. Users show great variety in where they first land on the app, and follow different sequences of actions.
To examine user behaviour, we studied three sales funnels. Regardless of the sequence of actions, all three funnels lost the highest percentage of users after the offers page. The company will need to investigate the offers page to see, for example, whether the offers are attractive enough and whether the page links smoothly to the cart and payment pages.
We also examined the results of the A/A/B test to see if the change of app fonts changed user behaviour. The outcomes show no statistically significant difference in user behaviour between the old and new versions. In other words, we cannot say that users are intimidated by the new fonts.